APR-2S-20O4 10:03 AM 



ROBROY FAWCETT 



1 T60 738 T00S 






Claim Listing: 

1. (currently amended) Method for generating facial animation values using a sequence 
of facial image frames and synchronously captured audio data of a speaking actor, comprising the 
steps for: 

providing a plurality of visual-facial-animation values based on tracking, without using 
markers attached to the actor's face, of facial features in the sequence of facial image frames of 
the speaking actor; 

providing a plurality of audio-facial-animation values based on visemes detected using 
the synchronously captured audio voice data of the speaking actor; and 

combining the plurality of visual facial animation values and the plurality of audio facial 
animation values to generate output facial animation values for use in facial animation. 

2. (original) Method for generating facial animation values as defined in claim 1, 
wherein the output facial animation values associated with a mouth for a facial animation are 
based only on the respective mouth-associated values of the plurality of audio facial animation 
values. 



3. (original) Method for generating facial animation values as defined in claim 1, 
wherein the output facial animation values associated with a mouth for a facial animation are 
based only on the respective mouth-associated values of the plurality of audio facial animation 
values. 

4. (original) Method for generating facial animation values as defined in claim 1, 
wherein the output facial animation values associated with a mouth for a facial animation are 
based on Kalman filtering of the respective mouth-associated values of the plurality of visual 
facial animation values and the respective mouth-associated values of the plurality of audio facial 
animation values. 
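For illustration only, the Kalman filtering recited above might be sketched as a scalar filter that treats the visual and audio mouth values as two noisy measurements of one underlying mouth parameter per frame. The function name `kalman_fuse`, the random-walk state model, and the noise variances `r_visual`, `r_audio`, and `q` are assumptions made for this sketch; the claim does not specify them.

```python
def kalman_fuse(visual, audio, r_visual=0.04, r_audio=0.01, q=0.001):
    """Fuse two per-frame measurement streams of one mouth value.

    Hypothetical sketch: a scalar Kalman filter with a random-walk
    state model. r_visual and r_audio are assumed measurement noise
    variances; q is an assumed process noise variance.
    """
    if len(visual) != len(audio):
        raise ValueError("streams must have the same number of frames")
    estimates = []
    x = None  # state estimate (the fused mouth value)
    p = None  # variance of the state estimate
    for z_v, z_a in zip(visual, audio):
        if x is None:
            # Initialise the state from the first visual measurement.
            x, p = z_v, r_visual
        else:
            # Predict: the value is assumed to drift slowly between frames.
            p += q
            # Update with the visual measurement.
            k = p / (p + r_visual)
            x += k * (z_v - x)
            p *= 1.0 - k
        # Update with the audio measurement for the same frame.
        k = p / (p + r_audio)
        x += k * (z_a - x)
        p *= 1.0 - k
        estimates.append(x)
    return estimates
```

With the assumed variances above, the audio stream is trusted more than the visual stream, so the fused value sits closer to the audio measurement when the two disagree.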



5. (original) Method for generating facial animation values as defined in claim 1, 
wherein the step of combining the plurality of visual facial animation values and the plurality of 
audio facial animation values to generate output facial animation values includes detecting 
whether speech is occurring in the synchronously captured audio voice data of the speaking actor 
and, while speech is detected as occurring, generating the output facial animation values 
associated with a mouth based only on the respective mouth-associated values of the plurality of 
audio facial animation values and, while speech is not detected as occurring, generating the 
output facial animation values associated with a mouth based only on the respective mouth- 
associated values of the plurality of visual facial animation values. 
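As a minimal sketch of the speech-gated selection recited above, the mouth value for each frame can be taken from the audio stream while speech is detected and from the visual stream otherwise. The RMS-energy detector, its threshold, and the names `speech_detected` and `select_mouth_values` are assumptions for illustration; the claim does not say how speech is detected.

```python
import math


def speech_detected(samples, rms_threshold=0.02):
    """Return True if the frame's RMS energy meets an assumed threshold."""
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms >= rms_threshold


def select_mouth_values(visual, audio, audio_frames, rms_threshold=0.02):
    """Pick audio-derived mouth values during speech, visual ones otherwise.

    visual, audio: per-frame mouth-associated animation values.
    audio_frames: per-frame lists of audio samples (hypothetical layout).
    """
    output = []
    for v, a, samples in zip(visual, audio, audio_frames):
        output.append(a if speech_detected(samples, rms_threshold) else v)
    return output
```

For example, `select_mouth_values([0.1, 0.2], [0.9, 0.8], [[0.0, 0.0], [0.5, -0.5]])` takes the visual value for the silent first frame and the audio value for the second.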



6. (original) Method for generating facial animation values as defined in claim 1, 
wherein the tracking of facial features in the sequence of facial image frames of the speaking 
actor is performed using bunch graph matching. 

7. (original) Method for generating facial animation values as defined in claim 1, 
wherein the tracking of facial features in the sequence of facial image frames of the speaking 
actor is performed using transformed facial image frames generated based on wavelet 
transformations. 

8. (original) Method for generating facial animation values as defined in claim 1, 
wherein the tracking of facial features in the sequence of facial image frames of the speaking 
actor is performed using transformed facial image frames generated based on Gabor wavelet 
transformations. 
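For orientation, a Gabor wavelet is a sinusoidal carrier under a Gaussian envelope. The sketch below builds the real part of one 2D Gabor kernel in the textbook form; the function name `gabor_kernel` and all default parameter values are assumptions for illustration and are not taken from the application.

```python
import math


def gabor_kernel(size=9, wavelength=4.0, theta=0.0, sigma=2.0, gamma=1.0, psi=0.0):
    """Real part of a 2D Gabor kernel as a list of rows.

    size: odd kernel width and height in pixels.
    wavelength: carrier wavelength in pixels.
    theta: carrier orientation in radians.
    sigma: Gaussian envelope standard deviation.
    gamma: spatial aspect ratio of the envelope.
    psi: carrier phase offset in radians.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the carrier's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma * sigma))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel
```

Convolving an image frame with a bank of such kernels at several orientations and wavelengths is one common way to obtain a wavelet-transformed frame for feature tracking.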

9. (currently amended) Apparatus for generating facial animation values using a 
sequence of facial image frames and synchronously captured audio data of a speaking actor, 
comprising: 

means for providing a plurality of visual-facial-animation values based on tracking, 
without using markers attached to the speaking actor's face, of facial features in the sequence of 
facial image frames of the speaking actor; 

means for providing a plurality of audio-facial-animation values based on visemes 
detected using the synchronously captured audio voice data of the speaking actor; and 



means for providing a plurality of visual-facial-animation values based on tracking of 
facial features in the sequence of facial image frames of the speaking actor; 

means for combining the plurality of visual facial animation values and the plurality of 
audio facial animation values to generate output facial animation values for use in facial 
animation. 

10. (original) Apparatus for generating facial animation values as defined in claim 9, 
wherein the output facial animation values associated with a mouth for a facial animation are 
based only on the respective mouth-associated values of the plurality of audio facial animation 
values. 

11. (original) Apparatus for generating facial animation values as defined in claim 9, 
wherein the output facial animation values associated with a mouth for a facial animation are 
based on a weighted average of the respective mouth-associated values of the plurality of visual 
facial animation values and the respective mouth-associated values of the plurality of audio facial 
animation values. 
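The weighted average recited above can be sketched, for illustration, as a per-frame convex combination of the two mouth values. The name `weighted_mouth_values` and the single fixed `audio_weight` are assumptions; the claim does not state how the weights are chosen.

```python
def weighted_mouth_values(visual, audio, audio_weight=0.5):
    """Blend per-frame visual and audio mouth values.

    audio_weight is an assumed fixed weight in [0, 1]; the visual
    stream receives the complementary weight (1 - audio_weight).
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    if len(visual) != len(audio):
        raise ValueError("streams must have the same number of frames")
    return [
        (1.0 - audio_weight) * v + audio_weight * a
        for v, a in zip(visual, audio)
    ]
```

Setting `audio_weight` to 1.0 reduces this to the audio-only case, and 0.0 to the visual-only case.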

12. (original) Apparatus for generating facial animation values as defined in claim 9, 
wherein the output facial animation values associated with a mouth for a facial animation are 
based on Kalman filtering of the respective mouth-associated values of the plurality of visual 
facial animation values and the respective mouth-associated values of the plurality of audio facial 
animation values. 



13. (original) Apparatus for generating facial animation values as defined in claim 9, 
wherein the means for combining the plurality of visual facial animation values and the plurality 
of audio facial animation values to generate output facial animation values includes means for 
detecting whether speech is occurring in the synchronously captured audio voice data of the 
speaking actor and, while speech is detected as occurring, generating the output facial animation 
values associated with a mouth based only on the respective mouth-associated values of the 
plurality of audio facial animation values and, while speech is not detected as occurring, 
generating the output facial animation values associated with a mouth based only on the 
respective mouth-associated values of the plurality of visual facial animation values. 

14. (original) Apparatus for generating facial animation values as defined in claim 9, 
wherein the tracking of facial features in the sequence of facial image frames of the speaking 
actor is performed using bunch graph matching. 

15. (original) Apparatus for generating facial animation values as defined in claim 9, 
wherein the tracking of facial features in the sequence of facial image frames of the speaking 
actor is performed using transformed facial image frames generated based on wavelet 
transformations. 



16. (original) Apparatus for generating facial animation values as defined in claim 9, 
wherein the tracking of facial features in the sequence of facial image frames of the speaking 
actor is performed using transformed facial image frames generated based on Gabor wavelet 
transformations. 